Using TEI, CMDI and ISOcat in CLARIN-DK
Abstract
This paper presents the challenges and issues encountered in converting TEI header metadata into the CMDI format. The work is carried out in the Danish research infrastructure, CLARIN-DK, to enable the exchange of language resources nationally and internationally, in particular with other partners of CLARIN ERIC. The paper describes the task of converting an existing TEI specification that is applied to all the text resources deposited in DK-CLARIN. During the task we have tried to reuse and share CMDI profiles and components in the CLARIN Component Registry, and to link the CMDI components and elements to the relevant data categories in the ISOcat Data Category Registry. The conversion of the existing metadata into the CMDI format turned out not to be a trivial task, and the experience and insights gained from this work have resulted in a proposal for a workflow for future use. We also present a core TEI header metadata set.
Similar resources
Semantic metadata mapping in practice: the Virtual Language Observatory
In this paper we present the Virtual Language Observatory (VLO), a metadata-based portal for language resources. It is completely based on the Component Metadata (CMDI) and ISOcat standards. This approach allows for the use of heterogeneous metadata schemas while maintaining the semantic compatibility. We describe the metadata harvesting process, based on OAI-PMH, and the conversion from severa...
Experiences with the ISOcat Data Category Registry
The ISOcat Data Category Registry has been a joint project of both ISO TC 37 and the European CLARIN infrastructure. In this paper the experiences of using ISOcat in CLARIN are described and evaluated. This evaluation clarifies the requirements of CLARIN with regard to a semantic registry to support its semantic interoperability needs. A simpler model based on concepts instead of data categorie...
CLARIN Concept Registry: The New Semantic Registry
The CLARIN Concept Registry (clarin.eu/conceptregistry) is the place in the CLARIN Infrastructure where common and shared semantics of, but not limited to, linguistic concepts are defined. This is important to achieve semantic interoperability, and to overcome to a degree the diversity in data structures, either in metadata or linguistic resources, encountered within the infrastructure. Whereas...
Creating & Testing CLARIN Metadata Components
The CLARIN Metadata Infrastructure (CMDI) that is being developed in CLARIN (Common Language Resources and Technology Infrastructure) is a computer-supported framework that combines a flexible component approach with the explicit declaration of semantics. The goal of the Dutch CLARIN project “Creating & Testing CLARIN Metadata Components” is to create metadata components and profiles for a wide...
The CMDI MI Search Engine: Access to Language Resources and Tools Using Heterogeneous Metadata Schemas
The CLARIN Metadata Infrastructure (CMDI) provides a solution for access to different types of language resources and tools across Europe. Researchers have different research data and tools, which are large-scale and described differently with domain-specific metadata. In the context of the Search & Develop (S&D) project at the Meertens Institute within CLARIN, we present a system description o...
Publication date: 2014